SURVIVAL HISTORY

This file is a brief summary of the changes and enhancements made to Survival since its first release.

SURVIVAL 6.0.2

• The median survival time, with its 95% CI, is now estimated and printed for the different groups.

• If you double-click a Text File or a Binary File created by the program, the application is launched and the selected file is automatically loaded and opened.

• If you try to select a variable that is already selected as the Status variable or the Time variable, you get three beeps and the corresponding box flashes to warn you.

• If you hold down the Option key while selecting the Help item from the Apple menu, a full description of the available macro commands is displayed in the Help window; it can be printed with the Print option of the File menu. If you don't press the Option key, a Help window with a general description of the program and the different menu options is displayed. The Help window and the Macro window are now displayed in styled text for easier reading, and their contents have been updated to reflect the changes introduced in the newer versions.

• If the Option key is held down while launching the application, a macro file is loaded into memory and the macro in the second position of the Macros menu is executed automatically. The first position is reserved so that the user can define a function for time-dependent survival analysis. The file must be named 'Survival Macros' and must be located in the same folder as the application. If you don't press the Option key, the file is simply loaded so that it can be used from the Macros menu. This option lets you link Survival with your current database for fully automated survival analysis.

• The MAKEBINS macro now lists the number of cases in each bin.

• If the CONTROL key is pressed while selecting a covariate in the variable window of the DEFINE MODEL dialog, a (c) sign appears to the right of the variable in the model window.
This option enables the internal transformation of categorical variables into K-1 INDICATOR or DEVIATION variables, where K is the number of categories of the selected variable. If you want to use the DEVIATION coding scheme, uncheck the 'USE INDICATOR VARIABLES' box. You may obtain the same results with the TRANSFORM option of the MODEL menu, but then you have to include the different variables in the model manually. In the output window the coefficients are labeled VAR1-1, VAR1-2, etc., or NAME-1, NAME-2, NAME-3, etc. if you checked the LABEL VARIABLES option.

*** VERY IMPORTANT
--------------
All the variables selected as categorical (Control key down) must be selected in ASCENDING ORDER. If you break this order (for example, by trying to select variable 6 after a variable with an equal or higher column number), the program beeps, the model window is CLEARED and you have to start the variable selection again. This rule also applies when a model combines categorical and non-categorical variables. Non-categorical variables alone may be selected in any order.
***

I was compelled to add this option to my program because a common error I have observed in survival analysis is the use of categorical variables 'as is' as covariates in regression models. You should be aware that if you use this type of variable you are assuming 'a priori' that the hazard function depends linearly on the values of the categorical variable, an assumption that most of the time is wrong (I repeat, wrong) unless demonstrated otherwise. If you use the INDICATOR coding scheme, you compare the effect of a particular category with that of another category taken as the reference (the first one in our case).
The value and sign of the new coefficients tell you whether that particular category increases (positive values) or decreases (negative values) the chance of survival. The coefficient for the reference category (the first one) is 0 and is not displayed. If you use the DEVIATION coding scheme, you compare a particular category with the average effect of all categories. The coefficient for the reference category (the first one) is calculated as the negative of the sum of the other coefficients for that variable.

Use the TEST OF HYPOTHESIS option to assess the joint effect upon the hazard of the different categories of the transformed variables, or of the different combinations of covariates that you consider significant in your particular model.

** YOU CAN'T IMAGINE HOW MUCH YOUR RESULTS AND CONCLUSIONS MAY CHANGE IF YOU USE AN INAPPROPRIATE CODING OF YOUR VARIABLES WITH COX'S OR PARAMETRIC REGRESSION ANALYSIS **

HERE ARE MY RECOMMENDATIONS:

1) USE BINARY VARIABLES (0,1) WHENEVER POSSIBLE.

2) NEVER USE CATEGORICAL VARIABLES AS COVARIATES. TRANSFORM THEM INTO INDICATOR OR DEVIATION VARIABLES WITH THE ABOVE OPTIONS.

3) IF YOU USE NUMERICAL VARIABLES, MAKE AN ALTERNATIVE ANALYSIS BY TRANSFORMING THEM INTO DIFFERENT GROUPS OR CATEGORIES WITH THE 'MAKEBINS' MACRO (You'll love it) AND THEN TRANSFORMING THE CATEGORIES INTO INDICATOR OR DEVIATION VARIABLES. (Many times the effect of a numerical variable upon the chance of survival is only significant for a limited interval of values, for example between 60 and 70 years of age but not below or above that range. If you obtain a significant P value for a numerical variable and you don't explore this possibility, you may come to the wrong conclusion that the chance of survival simply decreases or increases with decreasing or increasing age.)

4) I PERSONALLY PREFER THE DEVIATION CODING SCHEME.
DON'T ASK ME WHY.

5) NEVER BECOME A SLAVE OF THE 'P < 0.05' EXPRESSION OR LET IT BE A SUBSTITUTE FOR YOUR NEURONS OR FOR COMMON SENSE (WHICH, UNFORTUNATELY, HAS BEEN LACKING IN MEDICINE LATELY).

6) MODELLING THE RISK IS THE BEST CHALLENGE AND THE BEST WAY TO TEST YOUR KNOWLEDGE OF A PARTICULAR CLINICAL INVESTIGATION PROBLEM. USE THE LIKELIHOOD RATIO TEST TO ASSESS WHETHER A COVARIATE WITH A NON-SIGNIFICANT P VALUE SHOULD BE LEFT IN OR REMOVED FROM THE MODEL.

7) ALWAYS STICK TO THE BASIC PRINCIPLE OF SCIENCE, 'PARSIMONY': TRY TO DESCRIBE YOUR PROBLEM WITH SIMPLE MODELS, USING THE LEAST POSSIBLE NUMBER OF VARIABLES.

8) REMEMBER THAT 'CHAOS THEORY' STATES THAT COMPLEX, NON-LINEAR SYSTEMS (AS MOST BIOLOGICAL SYSTEMS ARE) ARE UNPREDICTABLE. THEREFORE, DON'T EXPECT YOUR PARTICULAR MODEL TO BE USEFUL FOR EXACT PREDICTIONS, AND REMEMBER THAT MINIMAL CHANGES (THE BUTTERFLY EFFECT) MAY RADICALLY CHANGE THE RESULT OF YOUR PREDICTIONS.

New macros added in version 6.0.2:

• SURVPLOT; activates the plot of both the Kaplan-Meier and the Model survival functions. It must be used before the ESTIMATE macro command. This macro replaces the GRAPHPLOT('true','false') command of previous versions.

• LOGPLOT; activates the plot of the Log(-Log) of the survival function, which is used for testing the proportionality assumption. It must be used after the ESTIMATE macro command.

• INTPLOT(number); activates the plot of the confidence intervals for the current survival function. It must be used after the ESTIMATE command. 'number' is 1 for the 95% CI and 2 for the 99% CI.

• RESPLOT; activates the plot of the cumulative function for residuals, which is used to test the goodness of fit of the data to the model. It must also be used after the ESTIMATE macro command.

• SAVEPICT('name'); saves the picture displayed in the active plot window to disk, in the same folder as the application.
This command should be used after the SURVPLOT, INTPLOT, LOGPLOT and RESPLOT macro commands if you intend to run several survival analyses with a single macro and want to save the graphical output. This macro also closes the active plot window after saving it to disk. If you use an empty string as the 'name' argument, then until you start a new analysis the plots are saved as Kaplan-Meier.PICTn, Model.PICTn, LogLog.PICTn, Intervals.PICTn and Residuals.PICTn, where 'n' is automatically increased for each saved file of the same plot type. Use a path if you want the pictures saved in a different folder. This option is not available with the Macro Command submenu.

• RUNMACRO(number); executes the macro located at position 'number' in the Macros menu. Useful for linking different macros in complex automated survival analyses. Read the macros in the 'Survival Macros' file carefully to learn how to use these commands.

• SORTBY(column1,column2); sorts the data array in ascending order by 'column1' and then by 'column2'. If you put a zero for 'column2', the data array is sorted only by the values in 'column1'.

• CLEAR; clears the active output window. This option is not available with the Macro Command submenu for System versions earlier than 7.xx.

• FIXDATA; makes the changes to the data array permanent. You need to reload your data file if you want your original data back in memory.

• CATEGORIES(Var1,var2,..option); identifies the categorical variables of the data array included in the model that have to be transformed into DEVIATION (option = 0) or INDICATOR (option = 1) variables.

STATISTICAL MACROS:

These macros are a great help for an on-line assessment of summary statistics of the data array in memory. Using these macros together with standard Pascal procedures, you may even compute other statistical tests and the exact probability associated with the test value.

¿ DO YOU DARE ?

• STAT(var,var, ...)
computes descriptive statistics (mean, SD, SE, Min, Max and Range) for the specified 'vars'. If you use a minus sign (-) as the argument, the statistics for all variables in the active data array are computed.

• CHITEST(var1,var2); computes a Chi-Square test for the categories in the 'var1' column of the data array by the categories in the 'var2' column. Results are presented in a contingency table with one row per category of 'var1' and one column per category of 'var2'. The maximum number of categories is seven, and the categories should be numbered from 1 to n.

• TTEST(var1,var2,[Var3]); computes a Student T-Test between Var1 and Var2. If you use the same value for Var1 and Var2 and supply the optional Var3, a T-Test is computed between the groups defined by Var3. The number of groups must be two (use the RECODE or COMPUTE macro if necessary); otherwise you'll get an error message.

• CHIPROB(chi,df); computes the exact P value for the Chi-Square value 'chi' with 'df' degrees of freedom.

• FPROB(F,df1,df2); computes the exact P value for the F value 'F' with 'df1' and 'df2' degrees of freedom.

• NORMPROB(Z); computes the exact P value for the Normal deviate 'Z'.

• TPROB(T,df); computes the exact P value for the Student T value 'T' with 'df' degrees of freedom.
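
For readers who want to cross-check CHITEST and CHIPROB results outside Survival, the following is a minimal Python sketch of the same kind of computation. It is not the program's own algorithm: the function names chi_square_statistic and chi_square_p are hypothetical, the P value is assumed to be the upper-tail probability of the Chi-Square distribution, and every row and column total of the table is assumed to be greater than zero.

```python
import math


def chi_square_statistic(table):
    """Return (chi, df) for a contingency table given as a list of rows of counts.

    Assumes every row total and column total is greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi, df


def chi_square_p(chi, df):
    """Upper-tail P value of the Chi-Square distribution for integer df >= 1.

    Uses the finite closed-form series for even and odd degrees of freedom.
    """
    if chi <= 0:
        return 1.0
    half = chi / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + sqrt(2/pi) * exp(-x/2)
    #         * sum_{k=1}^{(df-1)/2} x^(k-1/2) / (1*3*...*(2k-1))
    term, total = math.sqrt(chi), 0.0
    for k in range(1, (df - 1) // 2 + 1):
        if k > 1:
            term *= chi / (2 * k - 1)
        total += term
    tail = math.sqrt(2.0 / math.pi) * math.exp(-half) * total
    return math.erfc(math.sqrt(half)) + tail


if __name__ == "__main__":
    # Hypothetical 2 x 2 table of counts: rows are categories of var1,
    # columns are categories of var2.
    counts = [[10, 20], [20, 10]]
    chi, df = chi_square_statistic(counts)
    # Same layout as Temp[1], Temp[2], Temp[3] after CHITEST: chi, P, df.
    temp = [chi, chi_square_p(chi, df), df]
    print(temp)
```

The three values are collected in the order Chi-Value, P-Value, df, so they can be compared directly with the numbers that Survival reports for the same table.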

You may access the numerical results of these tests for further processing within a standard Pascal procedure or macro with the TEMP[] token, as follows:

==============================================================================
            Temp[1]    Temp[2]   Temp[3]   Temp[4]   Temp[5]   Temp[6]
CHITEST     Chi-Value  P-Value   df        -         -         -
TTEST       T-Value    P-Value   df        -         -         -
CHIPROB     Chi-Value  P-Value   df        -         -         -
NORMPROB    Z-Value    P-Value   -         -         -         -
TPROB       T-Value    P-Value   df        -         -         -
FPROB       F-Value    P-Value   df1       df2       -         -
STAT(x)     Mean(x)    SD(x)     SE(x)     Min(x)    Max(x)    Range(x)
------------------------------------------------------------------------------

Example: if Temp[2] < 0.05 then RUNMACRO(2) else EXIT;

Bug corrections

• A bug that prevented the model from being saved when only ONE variable was selected has been fixed.

• A bug that crashed the program when trying to save the data array as 'Application File' under System 6.7x has been fixed.

Survival Version 6.1.1 (January, 1996)

This version supports Parametric Survival Analysis with Exponential, Weibull and Log-Normal models.

Survival Version 6.1.2 (March, 1996)

• Plots of residuals, 95% confidence intervals and survival functions (Survival, Density and Hazard) have been implemented for Parametric Survival Analysis.

• The survival function for a given parametric model is now plotted by default for the mean covariate values if you don't specify a covariate pattern. This avoids the ragged appearance of the plots in Version 6.1.1.

• A new macro for automated survival analysis with parametric models has been added. Its syntax is:

PARAMETRIC('ModelType',option1,option2,option3,sigma);

'ModelType' is the type of parametric model:

• Set 'ModelType' to 'exponential' to fit the data to an exponential model.
• Set 'ModelType' to 'weibull' to fit the data to a two-parameter Weibull model.
• Set 'ModelType' to 'lognormal' to fit the data to a Log-Normal model.

'option1', 'option2' and 'option3' are boolean expressions:

• Set 'option1' to true if you want to use a constant term.
This option should always be true if you include covariates in the model.
• Set 'option2' to true to center the covariate values around their mean value.
• Set 'option3' to true if you want the iterations to be printed in the output window.

'sigma' is the initial value for the estimation of the model's sigma term. It should be set to 1 unless you have convergence problems.

Example: PARAMETRIC('weibull',true,false,false,1);

This macro command fits your survival data to a parametric Weibull model with a constant term and an initial sigma value of 1. You have to define the status and time variables and the covariates (if any) before the PARAMETRIC macro, as in any Cox regression analysis. If you don't include covariates, the survival function for the null model is computed. See the examples in the 'Survival Macros' file.

• Cosmetic changes have been made to some Dialog and Alert windows, and minor typos have been corrected.

• The plots of residuals and survival functions have been changed. Individual points are now plotted instead of a continuous line.

Survival Version 6.1.3 (June, 1996)

• The user interface has been modified to make it friendlier, easier to use and fully adapted to the Mac user interface specifications.

• PopUp menus are now used within the Define Model dialog box:
  • To control the numerical output to the active Output window
  • To choose the different time units
  • To export the different survival functions and the current survival model
  • To edit the names of the selected covariate labels and group labels, and to enter the values for a given covariate pattern and the initial coefficient estimates.

• Labels for the variables can be included in the data file. You may use a Tab-delimited text file or a binary file. In the first case, follow these steps:

- Include the variable labels in the first row of the data file, delimited with Tabs; they must match the number of variables. The program looks for data starting with an alphabetic character in the first row and converts them to labels.
  You may also use numeric characters and the symbols '%, $, #, /, -, _, &, *'. Notice that the period "." symbol can't be used within a label. The name of the label is truncated to the first 8 characters (Examples: Age, Stage_1, Grade$, etc.).
- Save the file as a binary file. The variable labels are included as a LABL resource in the resource fork of the data file.

In the second case, follow these steps:

- Load a Text File or a Binary File.
- From the Macro Command window issue the following macro:

  varlabel('label1', 'label2', 'label3', ...)

  Labels must be entered as quoted strings and must match the number of variables in the data file; otherwise you'll get an error message.
- Save the new data file with the Save File As option from the File menu.

If you want to change the names of the variables included in a model, choose Edit Covariate Labels from the PopUp menu in the Define Model dialog box. From then on, the variables are shown with their labels in the variables window of the Define Model dialog box and in all subsequent outputs as well.

• Options have been added to the File menu to delete a file or to add an Info window to the current file. The information is saved as an INFO resource in the resource fork of the data file, and its length is limited to 256 characters.

• You can now choose the value to be used as the reference when transforming categorical variables into indicator or deviation variables. Just put the value you want in the edit field to the right of the Use Indicator Variable check box before selecting the variable you want to transform from the variables window (remember to hold the Option key down).

• An option has been added to append to the data file in memory the values of the residuals obtained in the last analysis when using parametric models. To keep these values, save the data file again, with the same name or a different name, as a binary file or a Tab-delimited text file.
If you use labels for the variables, the residuals are added as the last column of the data file under the label 'RESIDS'.

• You can now make bivariate plots of the variables in the current data file. Select this option from the Options item of the File menu. Uncheck the 'Zero Value for X-Y origin' box in the dialog window if you plot residuals, since they have both positive and negative values; otherwise negative values would be excluded from the plot.

• Very important: use the command STAT(*) instead of STAT(-) to display the summary statistics for all variables in the current data file.

********************************************************************
I CONSIDER VERSION 6.1.3 THE END PRODUCT. GOD KNOWS THAT IT COST ME SWEAT AND BLOOD AND LOTS OF TIME AND FRUSTRATION, BUT I'LL BE HAPPY IF IT IS USEFUL EVEN FOR A SINGLE PERSON. I'M OPEN TO SUGGESTIONS AND CONTRIBUTIONS TO ENHANCE THE PROGRAM, PROVIDED THAT IT REACHES SOME REASONABLE LEVEL OF DIFFUSION ON THE NET (WHO KNOWS ?)

A FULL TUTORIAL ON THE BASICS OF SURVIVAL ANALYSIS WITH PARAMETRIC AND NON-PARAMETRIC MODELS IS UNDER DEVELOPMENT. IT IS ADDRESSED TO PEOPLE (LIKE ME 10 YEARS AGO) WITH A MINIMAL MATHEMATICAL AND STATISTICAL BACKGROUND. IT WILL BE POSTED TO SUMEX AS SURVIVAL.TUTORIAL.HQX OR INCLUDED IN A PACK WITH THE NON-FPU VERSION OF THE PROGRAM FOR POWER MAC USERS.

! PLEASE TRASH ALL PREVIOUS VERSIONS !
********************************************************************

Report any problem or bug you may detect to:

Manuel Urrutia Avisrror, M.D., Ph.D.
Professor of Urology
University of Salamanca - Department of Urology
Salamanca - SPAIN
e-mail: urrutia@gugu.usal.es